Marcus Rohrbach

57 publications

11 venues

H Index 24

Name Venue Year Citations
Improving Selective Visual Question Answering by Learning from Your Peers. CVPR 2023 0
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly. ECCV 2022 2
Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition. ECCV 2022 1
Learning To Recognize Procedural Activities with Distant Supervision. CVPR 2022 7
CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition. ECCV 2022 0
FLAVA: A Foundational Language And Vision Alignment Model. CVPR 2022 0
SMART Frame Selection for Action Recognition. AAAI 2021 0
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA. CVPR 2021 0
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting. ICLR 2021 0
Adversarial Continual Learning. ECCV 2020 81
In Defense of Grid Features for Visual Question Answering. CVPR 2020 175
TextCaps: A Dataset for Image Captioning with Reading Comprehension. ECCV 2020 72
Learning to Generate Grounded Visual Captions Without Localization Supervision. ECCV 2020 0
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA. CVPR 2020 0
12-in-1: Multi-Task Vision and Language Representation Learning. CVPR 2020 0
Decoupling Representation and Classifier for Long-Tailed Recognition. ICLR 2020 0
Uncertainty-guided Continual Learning with Bayesian Neural Networks. ICLR 2020 0
Cycle-Consistency for Robust Visual Question Answering. CVPR 2019 113
Towards VQA Models That Can Read. CVPR 2019 240
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution. ICCV 2019 330
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CVPR 2019 76
Large-Scale Visual Relationship Understanding. AAAI 2019 0
Probabilistic Neural Symbolic Models for Interpretable Visual Question Answering. ICML 2019 0
CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication. ACL 2019 0
Grounded Video Description. CVPR 2019 0
Graph-Based Global Reasoning Networks. CVPR 2019 0
Adversarial Inference for Multi-Sentence Video Description. CVPR 2019 0
A Dataset for Telling the Stories of Social Media Videos. EMNLP 2018 44
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence. CVPR 2018 287
Exploring the Challenges Towards Lifelong Fact Learning. ACCV 2018 12
Visual Coreference Resolution in Visual Dialog Using Neural Module Networks. ECCV 2018 128
Memory Aware Synapses: Learning What (not) to Forget. ECCV 2018 0
Generating Descriptions with Grounded and Co-referenced People. CVPR 2017 54
Learning to Reason: End-to-End Module Networks for Visual Question Answering. ICCV 2017 475
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training. ICCV 2017 218
Modeling Relationships in Referential Expressions with Compositional Modular Networks. CVPR 2017 0
Captioning Images with Diverse Objects. CVPR 2017 0
Long-Term Recurrent Convolutional Networks for Visual Recognition and Description. TPAMI 2017 0
Segmentation from Natural Language Expressions. ECCV 2016 212
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. EMNLP 2016 1218
Generating Visual Explanations. ECCV 2016 509
Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags. AAAI 2016 30
Grounding of Textual Phrases in Images by Reconstruction. ECCV 2016 0
Neural Module Networks. CVPR 2016 0
Natural Language Object Retrieval. CVPR 2016 0
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data. CVPR 2016 0
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images. ICCV 2015 551
Spatial Semantic Regularisation for Large Scale Object Detection. ICCV 2015 22
A Dataset for Movie Description. CVPR 2015 360
Long-term recurrent convolutional networks for visual recognition and description. CVPR 2015 0
Sequence to Sequence - Video to Text. ICCV 2015 0
Transfer Learning in a Transductive Setting. NIPS/NeurIPS 2013 229
Translating Video Content to Natural Language Descriptions. ICCV 2013 334
Script Data for Attribute-Based Recognition of Composite Activities. ECCV 2012 142
A database for fine grained activity detection of cooking activities. CVPR 2012 517
Evaluating knowledge transfer and zero-shot learning in a large-scale setting. CVPR 2011 333
What helps where - and why? Semantic relatedness for knowledge transfer. CVPR 2010 0